Trans-EZ at NTCIR-2: Synset Co-occurrence Method for English-Chinese Cross-Lingual Information Retrieval
Abstract
In this paper, a new method for English-Chinese cross-lingual information retrieval is proposed and evaluated in the NTCIR-II project. We use bilingual resources and contextual information to handle word sense disambiguation (WSD) and translation disambiguation in query translation. An English-Chinese WordNet and a synset co-occurrence model are adopted to resolve word sense ambiguity. Translation ambiguity and target polysemy are likewise resolved using the co-occurrence relationships of synsets. The experimental results are discussed to analyze the effects of ambiguity in the source and target languages.
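The general idea of choosing senses by synset co-occurrence can be illustrated with a minimal sketch. This is not the authors' implementation: the synset identifiers, co-occurrence scores, and translation table below are invented toy data, and the function names (`pair_score`, `disambiguate`, `translate`) are hypothetical. The sketch picks, for each query word, the synset combination with the highest total pairwise co-occurrence score, then reads a Chinese translation off each chosen synset. Taking the first listed translation is a simplification; the abstract states that target polysemy is also resolved through co-occurrence.

```python
from itertools import product

# Hypothetical toy data: candidate synsets for each English query word.
CANDIDATE_SYNSETS = {
    "bank": ["bank.finance", "bank.river"],
    "interest": ["interest.money", "interest.curiosity"],
}

# Hypothetical symmetric co-occurrence scores between pairs of synsets.
COOCCURRENCE = {
    frozenset(["bank.finance", "interest.money"]): 0.9,
    frozenset(["bank.finance", "interest.curiosity"]): 0.1,
    frozenset(["bank.river", "interest.money"]): 0.05,
    frozenset(["bank.river", "interest.curiosity"]): 0.2,
}

# Hypothetical Chinese translations attached to each synset.
TRANSLATIONS = {
    "bank.finance": ["銀行"],
    "bank.river": ["河岸"],
    "interest.money": ["利息"],
    "interest.curiosity": ["興趣"],
}


def pair_score(a, b):
    """Return the co-occurrence score of two synsets (0.0 if unseen)."""
    return COOCCURRENCE.get(frozenset([a, b]), 0.0)


def disambiguate(query_words):
    """Pick one synset per word, maximizing total pairwise co-occurrence."""
    candidates = [CANDIDATE_SYNSETS[w] for w in query_words]
    best, best_score = None, float("-inf")
    # Exhaustive search over all sense combinations; fine for short queries.
    for combo in product(*candidates):
        score = sum(
            pair_score(combo[i], combo[j])
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
        )
        if score > best_score:
            best, best_score = combo, score
    return dict(zip(query_words, best))


def translate(query_words):
    """Translate each word via its chosen synset (first translation only)."""
    senses = disambiguate(query_words)
    return [TRANSLATIONS[senses[w]][0] for w in query_words]


if __name__ == "__main__":
    print(disambiguate(["bank", "interest"]))
    print(translate(["bank", "interest"]))
```

The exhaustive search grows exponentially with query length, so a real system would likely need a greedy or windowed approximation; that choice is outside what the abstract specifies.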
Similar resources
Description of the NTU Japanese-English Cross-Lingual Information Retrieval System
This paper describes a Japanese-English Cross-Language Information Retrieval (CLIR) system for evaluation in the NTCIR project. We extend our work on Chinese-English CLIR to deal with this problem. Several query translation strategies, including select-all, select-first, and co-occurrence models by different corpora, and several query generation strategies, including topic, description, narrati...
English-Japanese Cross-lingual Query Expansion Using Random Indexing of Aligned Bilingual Text Data
Vector space models can be used for extracting semantically similar words from the co-occurrence statistics of words in large text data. In this paper, we report on our NTCIR 2002 experiments using the Random Indexing vector space method for extracting an English-Japanese cross-lingual thesaurus from aligned English-Japanese bilingual data. The cross-lingual thesaurus has been used for automatic...
Overview of CLIR Task at the Fifth NTCIR Workshop
The purpose of this paper is to overview research efforts at the NTCIR-5 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from over ten countries are ...
Overview of CLIR Task at the Sixth NTCIR Workshop
The purpose of this paper is to overview research efforts at the NTCIR-6 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from ten countries or region...
Korean-Chinese Cross-Language Information Retrieval Based on Extension of Dictionaries and Transliteration
This paper describes our Korean-Chinese cross-language information retrieval system. Our system uses a bilingual dictionary to perform query translation. We expand our bilingual dictionary by extracting words and their translations from the Wikipedia site, an online encyclopedia. To resolve the problem of translating Western people’s names into Chinese, we propose a transliteration mapping met...